Efficient Pitch-based Estimation of VTLN Warp Factors

Author

  • Arlo Faria
Abstract

To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to transform acoustic features for automatic speech recognition (ASR). The warp factors used in this process are usually derived by maximum likelihood (ML) estimation, involving an exhaustive search over possible values. We describe an alternative approach: exploit the correlation between a speaker’s average pitch and vocal tract length, and model the probability distribution of warp factors conditioned on pitch observations. This can be used directly for warp factor estimation, or as a smoothing prior in combination with ML estimates. Pitch-based warp factor estimation for VTLN is effective and requires relatively little memory and computation. Such an approach is well-suited for environments with constrained resources, or where pitch is already being computed for other purposes.
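As a rough illustration of the idea in the abstract, the sketch below scores a discrete grid of warp factors with a pitch-conditioned prior and, optionally, adds per-factor ML log-likelihoods to it. Everything specific here is an assumption made for illustration and is not taken from the paper: the grid `WARP_GRID`, the linear-Gaussian form of the prior, the `slope`, `intercept`, and `sigma` values, the direction of the pitch-to-warp mapping (which depends on the warping convention), and the `prior_weight` parameter.

```python
import math

# Hypothetical warp-factor grid (0.88 to 1.12 in steps of 0.02); not from the paper.
WARP_GRID = [round(0.88 + 0.02 * i, 2) for i in range(13)]


def pitch_log_prior(mean_pitch_hz, slope=0.0008, intercept=0.86, sigma=0.03):
    """Return log p(alpha | pitch) for each grid point.

    Assumes a Gaussian over warp factors whose mean is a linear function of
    the speaker's average pitch. The coefficients are placeholders; in
    practice they would be fitted on training speakers.
    """
    mu = intercept + slope * mean_pitch_hz
    scores = [-0.5 * ((a - mu) / sigma) ** 2 for a in WARP_GRID]
    # Normalize so the discretized prior sums to one over the grid.
    log_norm = math.log(sum(math.exp(s) for s in scores))
    return [s - log_norm for s in scores]


def estimate_warp(mean_pitch_hz, ml_log_likelihoods=None, prior_weight=1.0):
    """Pick a warp factor from pitch alone, or from pitch plus ML scores.

    ml_log_likelihoods, if given, holds one acoustic log-likelihood per entry
    of WARP_GRID (the usual exhaustive-search scores). The prior is added in
    the log domain, scaled by prior_weight.
    """
    log_prior = pitch_log_prior(mean_pitch_hz)
    if ml_log_likelihoods is None:
        scores = log_prior
    else:
        scores = [ll + prior_weight * lp
                  for ll, lp in zip(ml_log_likelihoods, log_prior)]
    best = max(range(len(WARP_GRID)), key=scores.__getitem__)
    return WARP_GRID[best]
```

Calling `estimate_warp(mean_pitch_hz)` with no likelihoods corresponds to direct pitch-based estimation, which avoids the exhaustive search entirely. Passing the ML scores corresponds to using the pitch model as a smoothing prior.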

Related resources

Efficient pitch-based estimation of VTLN warp factors

To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to transform acoustic features for automatic speech recognition (ASR). The warp factors used in this process are usually derived by maximum likelihood (ML) estimation, involving an exhaustive search over possible values. We describe an alternative approach: exploit the correlation between a speaker’s a...

Efficient three-stage pitch estimation for packet loss concealment

This paper presents a low-complexity pitch estimation algorithm for packet loss concealment. The algorithm divides the pitch estimation into three stages with each additional stage providing further accuracy. Compared with a system based on G.711 Appendix I, the proposed algorithm requires approximately 32 percent fewer cycles on a DSP processor integrated in a Bluetooth chip. Furthermore, obje...

A Computationally Efficient Method for Polyphonic Pitch Estimation

This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum. Such spectrum is calculated fro...

Multiple fundamental frequency estimation based on sparse representations in a structured dictionary

Automatic transcription of polyphonic music is an important task in audio signal processing, which involves identifying the fundamental frequencies (pitches) of several notes played at a time. Its difficulty stems from the fact that harmonics of different notes tend to overlap, especially in western music. This causes a problem in assigning the harmonics to...

Robust and efficient pitch estimation using an iterative ARMA technique

In this article, we propose an innovative way of estimating pitch from speech waveform data, using an iterative ARMA technique that efficiently estimates multiple frequency components of a time series. Additionally, the harmonic structure of voiced speech and the smoothness of its pitch period are incorporated into the iterative ARMA technique, and this novel integration results in an efficient...

Publication date: 2005